AITopics | safe harbor

safe harbor

In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI

Longpre, Shayne, Klyman, Kevin, Appel, Ruth E., Kapoor, Sayash, Bommasani, Rishi, Sahar, Michelle, McGregor, Sean, Ghosh, Avijit, Blili-Hamelin, Borhane, Butters, Nathan, Nelson, Alondra, Elazari, Amit, Sellars, Andrew, Ellis, Casey John, Sherrets, Dane, Song, Dawn, Geiger, Harley, Cohen, Ilona, McIlvenny, Lauren, Srikumar, Madhulika, Jaycox, Mark M., Anderljung, Markus, Johnson, Nadine Farid, Carlini, Nicholas, Miailhe, Nicolas, Marda, Nik, Henderson, Peter, Portnoff, Rebecca S., Weiss, Rebecca, Westerhoff, Victoria, Jernite, Yacine, Chowdhury, Rumman, Liang, Percy, Narayanan, Arvind

arXiv.org Artificial Intelligence, Mar-21-2025

The widespread deployment of general-purpose AI (GPAI) systems introduces significant new risks. Yet the infrastructure, practices, and norms for reporting flaws in GPAI systems remain seriously underdeveloped, lagging far behind more established fields like software security. Based on a collaboration between experts from the fields of software security, machine learning, law, social science, and policy, we identify key gaps in the evaluation and reporting of flaws in GPAI systems. We call for three interventions to advance system safety. First, we propose using standardized AI flaw reports and rules of engagement for researchers in order to ease the process of submitting, reproducing, and triaging flaws in GPAI systems. Second, we propose GPAI system providers adopt broadly-scoped flaw disclosure programs, borrowing from bug bounties, with legal safe harbors to protect researchers. Third, we advocate for the development of improved infrastructure to coordinate distribution of flaw reports across the many stakeholders who may be impacted. These interventions are increasingly urgent, as evidenced by the prevalence of jailbreaks and other flaws that can transfer across different providers' GPAI systems. By promoting robust reporting and coordination in the AI ecosystem, these proposals could significantly improve the safety, security, and accountability of GPAI systems.

large language model, machine learning, natural language, (16 more...)

arXiv:2503.16861

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > United Kingdom (0.14)
North America > United States > New York > New York County > New York City (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

A Safe Harbor for AI Evaluation and Red Teaming

Longpre, Shayne, Kapoor, Sayash, Klyman, Kevin, Ramaswami, Ashwin, Bommasani, Rishi, Blili-Hamelin, Borhane, Huang, Yangsibo, Skowron, Aviya, Yong, Zheng-Xin, Kotha, Suhas, Zeng, Yi, Shi, Weiyan, Yang, Xianjun, Southen, Reid, Robey, Alexander, Chao, Patrick, Yang, Diyi, Jia, Ruoxi, Kang, Daniel, Pentland, Sandy, Narayanan, Arvind, Liang, Percy, Henderson, Peter

arXiv.org Artificial Intelligence, Mar-7-2024

Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse create disincentives for good-faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal. Although some companies offer researcher access programs, they are an inadequate substitute for independent research access, as they have limited community representation, receive inadequate funding, and lack independence from corporate incentives. We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public-interest safety research and protecting it from the threat of account suspensions or legal reprisal. These proposals emerged from our collective experience conducting safety, privacy, and trustworthiness research on generative AI systems, where norms and incentives could be better aligned with public interests, without exacerbating model misuse. We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.

ai evaluation, harbor, safe harbor, (12 more...)

arXiv:2403.04893

Country:

Europe > United Kingdom (0.14)
North America > Canada (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.79)

Technology Helps Ensure There's No Safe Harbor for War Criminals

#artificialintelligence, Apr-13-2022, 14:45:59 GMT

In its effort to ensure there is no hiding place in the United States for war criminals, genocidaires and other human rights abusers, U.S. Immigration and Customs Enforcement has sought to harness the power of innovation, employing automated facial recognition technology and clever software algorithms to identify perpetrators who might be in, or be traveling to, America, officials told AFCEA's 2021 Federal Identity Forum and Expo Tuesday. War Crimes Hunter (WCH) is a series of customized reusable software tools built by the ICE Homeland Security Investigations (HSI) Innovation Lab in Crystal City, Virginia. It's used by HSI investigators in the Human Rights Violators and War Crimes Unit to try and identify suspected war criminals or other human rights violators. WCH automates the repetitive administrative work, while leaving key decisions to human analysts, explained Amy Nunes, a section chief in the unit. "We automate what we can automate as much as possible, still keeping an [human] analyst in the loop, because we didn't want to risk getting a bunch of [false positives or junk data] that we didn't need," she said.

artificial intelligence, bentall, investigator, (10 more...)

Country: North America > United States > Virginia (0.25)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Immigration & Customs (1.00)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (0.37)

The Winner Takes All: Recipe for Disaster - Netopia

#artificialintelligence, Jul-14-2017, 20:00:27 GMT

In the third decade of the commercial internet, concentration of power and money is greater than ever. Will this process stop or reverse? Or are we heading for a future of even stronger corporate dominance? Netopia talked to Jonathan Taplin, author of Move Fast and Break Things – a book which takes a closer look at the ideology and business of Silicon Valley's internet skyscrapers. Per Strömbäck: Is the "do first, ask later"-ideology the key to Silicon Valley's success?

artificial intelligence, jonathan taplin, social media, (13 more...)

Country: North America > United States > California (0.48)

Industry:

Information Technology > Services (0.53)
Transportation > Ground > Road (0.33)

Technology:

Information Technology > Communications > Social Media (0.61)
Information Technology > Artificial Intelligence (0.53)
